This work builds on my first (and failed) attempt to estimate how many ratings one needs to sample to obtain a reliable polarization measure for a given population. Readers are therefore advised to catch up on my previous work to be up to speed on this smaller project.
As the unconventional approach did not work, we will revert to the more “old school” methodology of an a priori power analysis. That is, estimating the minimum sample size needed to detect a set effect size (or bigger), given a significance level and a type II error rate (the false-negative rate).
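For a conventional test this is straightforward. A minimal sketch using the `pwr` package (not part of the original analysis; the effect size d = 0.5 is purely illustrative):

```r
# A priori power analysis for a two-sample t-test (illustrative values):
# detect d >= 0.5 at alpha = 0.05 with 80% power (i.e. a type II error of 20%)
library(pwr)
pwr.t.test(d = 0.5, sig.level = 0.05, power = 0.80, type = "two.sample")
```

This returns roughly n = 64 per group. The simulation below is needed precisely because no such closed-form effect size exists for polarization.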
The group has chosen to employ an 11-point Likert scale instead of the previously used 101-point one. Therefore, this work will also employ this scale.
As there is no real consensus on what polarization entails, obtaining a numerical quantifier for the effect size seems to be out of the question. Four preset distributions were chosen again, each with a differing degree of polarization. Readers familiar with the colors from my prior simulation will find that most of the distributions keep the spirit of their predecessors, but some changes were made:
It is important to note that while the normal and strong polarization distributions represent no effect and a strong effect respectively, the small and rare distributions both illustrate small effects in the population; ranking one over the other really depends on how polarization itself is interpreted/operationalized (e.g. as asymmetry, distance, or agreement).
For this work, we will use the same 20 sample sizes as the previous one:
## [1] 10 20 30 40 50 60 70 80 90 100 110 120 130 140 150 160 170 180 190
## [20] 200
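This vector can be reproduced with a one-liner:

```r
# 20 sample sizes, from 10 up to 200 in steps of 10
sample_sizes <- seq(from = 10, to = 200, by = 10)
```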
For each sample size, we replicate the random draw 200 times. For each sample, the measures of bimodality coefficient (BC), polarization, and group divergence are calculated. As in classification problems, we then have to set a threshold for each measure: values above the threshold indicate polarization, while those below indicate its absence. Luckily, the BC already has an established threshold of \(0.\overline{5}\) (i.e. 5/9).
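For reference, a minimal sketch of the bimodality coefficient, assuming the common finite-sample formula based on skewness and excess kurtosis (exact moment estimators differ slightly between implementations):

```r
# Bimodality coefficient: BC = (skew^2 + 1) / (excess kurtosis + correction),
# where the correction 3(n-1)^2 / ((n-2)(n-3)) adjusts for sample size.
# Values above 5/9 suggest bimodality.
bimodality_coefficient <- function(x) {
  n  <- length(x)
  z  <- (x - mean(x)) / sd(x)
  g1 <- mean(z^3)        # skewness
  g2 <- mean(z^4) - 3    # excess kurtosis
  (g1^2 + 1) / (g2 + 3 * (n - 1)^2 / ((n - 2) * (n - 3)))
}
```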
For the two other measures, I looked at the previous results: the polarization measure yields a low value for the rare distribution, even lower than for the normal distribution (it uses a weighted sum, so small groups are discounted). Setting a threshold low enough that rare distributions are also classified as polarized would therefore net us too many false positives (e.g. classifying normal distributions as polarized). I’ll set both the group divergence threshold and the polarization threshold arbitrarily at 0.5 and see how it goes…
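The thresholding step then reduces to a simple comparison (a sketch; the object names are mine):

```r
# Chosen cut-offs: BC uses its established 5/9, the other two are set at 0.5
thresholds <- c(BC = 5/9, group_divergence = 0.5, polarization = 0.5)

# A sample counts as polarized if its measure exceeds the threshold
is_polarized <- function(value, measure) value > thresholds[[measure]]
```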
Again, this is what our sampled matrix looks like, adopting a staircase-like shape. Using this method saves time and prevents errors.
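For readers without the previous post at hand, the staircase shape comes from padding each row to the largest sample size. A hypothetical reconstruction of one replication:

```r
# Row i holds a sample of size sizes[i], padded with NA to the widest row,
# so the non-NA entries form a staircase across the matrix.
draw_staircase <- function(population, sizes = seq(10, 200, by = 10)) {
  m <- matrix(NA_real_, nrow = length(sizes), ncol = max(sizes))
  for (i in seq_along(sizes)) {
    m[i, seq_len(sizes[i])] <- sample(population, sizes[i], replace = TRUE)
  }
  m
}
```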
After calculating the operationalisation measures for each of the 16,000 drawn samples (4 distributions × 20 sample sizes × 200 replications), we put the measures into a data frame. As we can see, there are no missing values, indicating that each measure was calculated successfully.
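The completeness check can be as simple as the following (`measures_df` is my placeholder name for the data frame):

```r
stopifnot(!anyNA(measures_df))  # fails loudly if any measure came out NA
colSums(is.na(measures_df))     # per-column count of missing values
```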
| Population Distribution | Bimodality Coefficient | Group Divergence | Polarization |
|---|---|---|---|
| None | 0.345 | 0.287 | 0.152 |
| Small Pol. | 0.655 | 0.288 | 0.297 |
| Rare | 0.766 | 0.762 | 0.185 |
| Strong Pol. | 0.884 | 0.804 | 0.823 |
## # A tibble: 80 × 4
## # Groups: Measure, transl_risk_distr [4]
## Measure transl_risk_distr sample_size `Prop_as_polarized_in_%`
## <chr> <fct> <dbl> <dbl>
## 1 BC None 10 2
## 2 BC None 20 2
## 3 BC None 30 1.5
## 4 BC None 40 0.5
## 5 BC None 50 0
## 6 BC None 60 0
## 7 BC None 70 0
## 8 BC None 80 0
## 9 BC None 90 0
## 10 BC None 100 0
## # ℹ 70 more rows
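The proportions in the tibble above can be reconstructed along these lines (a `dplyr` sketch; any column name not visible in the output, such as `classified_as_polarized`, is an assumption):

```r
library(dplyr)

# Share of replications classified as polarized, per measure,
# population distribution, and sample size
classification_rates <- measures_df %>%
  group_by(Measure, transl_risk_distr, sample_size) %>%
  summarise(`Prop_as_polarized_in_%` = 100 * mean(classified_as_polarized),
            .groups = "keep")
```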
As in all classification problems, there is no clear-cut line between polarized and non-polarized; the transition is gradual.
In line with the previous work, the assumption that the rare distribution is, in fact, polarized may not hold in the study itself: a few outliers, misinterpretation of the scale, or even a mouse slip may produce such a distribution, which we would then wrongly classify as polarized.
The thresholds were set somewhat arbitrarily and may therefore not hold in the real deal; they cannot be generalized beyond the distributions examined here. One might tune the thresholds to find an optimum that minimizes false positives and false negatives, but such an optimum would not generalize to other distributions either. As such, I’ll refrain from finding the optimum, as it would net us no additional insight.
Only three measures of operationalisation were used, and thus not every aspect of operationalization was covered.